A measurement of seasonal concentration in tourism

Author: Cisneros-Martínez José David
Publication date: 24/06/2013

Council for Hospitality Management Education's (CHME) Annual Research Conference 2013, Queen Margaret University, 16th-17th May 2013. FPU Programme of the Ministerio de Educación. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tec

Repositorio Institucional Universidad de Málaga

Learning relational dynamics of stochastic domains for planning

Authors: Alenyà Ribas Guillem; Inoue Katsumi; Martínez Martínez David; Ribeiro Tony; Torras Carme
Publication date: 01/01/2016

Probabilistic planners are very flexible tools that can provide good solutions for difficult tasks. However, they rely on a model of the domain, which may be costly to either hand-code or learn automatically for complex tasks. We propose a new learning approach that (a) requires only a set of state transitions to learn the model; (b) can cope with uncertainty in the effects; (c) uses a relational representation to generalize over different objects; and (d) can learn, in addition to action effects, exogenous effects that are not related to any action, e.g., moving objects, endogenous growth and natural development. The proposed learning approach combines a multi-valued variant of inductive logic programming, which generates candidate models, with an optimization method that selects the best set of planning operators to model a problem. Finally, experimental validation shows improvements over previous work. Peer reviewed. Postprint (author's final draft)

UPCommons. Portal del coneixement obert de la UPC

Learning relational dynamics of stochastic domains for planning

Authors: Alenyà Ribas Guillem; Inoue Katsumi; Martínez Martínez David; Ribeiro Tony; Torras Carme
Publication date: 01/01/2015

UPCommons. Portal del coneixement obert de la UPC

INRIA a CCSD electronic archive server

Digital.CSIC

Decentralized Cooperative Stochastic Bandits

Authors: Kanade Varun; Martínez-Rubio David; Rebeschini Patrick
Publication date: 01/01/2019

We study a decentralized cooperative stochastic multi-armed bandit problem with K arms on a network of N agents. In our model, the reward distribution of each arm is the same for each agent and rewards are drawn independently across agents and time steps. In each round, each agent chooses an arm to play and subsequently sends a message to her neighbors. The goal is to minimize the overall regret of the entire network. We design a fully decentralized algorithm that uses an accelerated consensus procedure to compute (delayed) estimates of the average of rewards obtained by all the agents for each arm, and then uses an upper confidence bound (UCB) algorithm that accounts for the delay and error of the estimates. We analyze the regret of our algorithm and also provide a lower bound. The regret is bounded by the optimal centralized regret plus a natural and simple term depending on the spectral gap of the communication matrix. Our algorithm is simpler to analyze than those proposed in prior work and it achieves better regret bounds, while requiring less information about the underlying network. It also performs better empirically.
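The two ingredients named in this abstract, consensus averaging over a network and a UCB index, can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses plain (non-accelerated) gossip averaging and the classic UCB1 bonus, with no correction for delay or estimation error. The 4-agent ring, the matrix W, the reward values, and the function names are all hypothetical and chosen only for illustration.

```python
import math


def consensus_step(values, weights):
    """One round of plain gossip averaging: x <- W x.

    Each agent replaces its value with a weighted average of the
    values held by itself and its neighbours.
    """
    n = len(values)
    return [sum(weights[i][j] * values[j] for j in range(n)) for i in range(n)]


def ucb_index(mean_estimate, pulls, t):
    """Classic UCB1 index: estimated mean plus an exploration bonus.

    An arm that has never been pulled gets an infinite index so that
    it is tried first.
    """
    if pulls == 0:
        return math.inf
    return mean_estimate + math.sqrt(2.0 * math.log(t) / pulls)


# Hypothetical communication matrix for a ring of 4 agents.
# It is symmetric and doubly stochastic, so repeated averaging
# drives every agent's value toward the network-wide mean.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]

# Made-up local reward observations for a single arm, one per agent.
local_rewards = [1.0, 0.0, 0.0, 1.0]

estimates = local_rewards
for _ in range(20):
    estimates = consensus_step(estimates, W)

# Each agent can now plug its (approximate) network average into the index.
indices = [ucb_index(m, pulls=4, t=10) for m in estimates]
```

How fast the estimates agree depends on the spectral gap of W, which is the quantity the abstract's regret bound refers to; the accelerated procedure mentioned there is a faster replacement for the plain iteration shown here.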

arXiv.org e-Print Archive

Oxford University Research Archive